Sequencing from Compomers: Using Mass Spectrometry for DNA De-Novo Sequencing of 200+ nt

Author

  • Sebastian Böcker
Abstract

One of the main endeavors in today's life science remains the efficient sequencing of long DNA molecules. Today, most de novo sequencing of DNA is still performed using the electrophoresis-based Sanger concept of 1977, in spite of certain restrictions of this method. Methods using mass spectrometry to acquire the Sanger sequencing data are limited by short sequencing lengths of 15-25 nt. We propose a new method for DNA sequencing using base-specific cleavage and mass spectrometry that appears to be a promising alternative to classical DNA sequencing approaches. A single-stranded DNA or RNA molecule is cleaved by a base-specific (bio-)chemical reaction using, for example, RNases. The cleavage reaction is modified such that not all, but only a certain percentage of bases are cleaved. The resulting mixture of fragments is then analyzed using MALDI-TOF mass spectrometry, whereby we acquire the molecular masses of fragments. For every peak in the mass spectrum, we calculate those base compositions that will potentially create a peak of the observed mass and, repeating the cleavage reaction for all four bases, finally try to uniquely reconstruct the underlying sequence from these observed spectra. This leads us to the combinatorial problem of sequencing from compomers and, finally, to the graph-theoretical problem of finding a walk in a subgraph of the de Bruijn graph. Application of this method to simulated data indicates that it might be capable of sequencing DNA molecules with 200+ nt.
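
The step of calculating, for each peak, the base compositions (compomers) that could produce the observed mass can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the tolerance, and the length cap are hypothetical, and the nucleotide masses are approximate average residue masses used only for illustration (terminal groups and cleavage chemistry are ignored).

```python
# Approximate average masses (Da) of DNA nucleotide residues.
# Illustrative assumption: real analyses must also account for terminal
# groups and the specific cleavage chemistry.
BASE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def compomer_mass(compomer):
    """Sum the masses of all bases in a compomer such as {"A": 2, "C": 1}."""
    return sum(BASE_MASS[base] * count for base, count in compomer.items())


def compomers_for_mass(observed_mass, tolerance=1.0, max_length=30):
    """Enumerate every compomer (counts of A, C, G, T) whose total mass lies
    within `tolerance` of `observed_mass` and has at most `max_length` bases."""
    bases = sorted(BASE_MASS)
    results = []

    def extend(index, counts, mass_so_far):
        # All four bases have been assigned a count: check the mass window.
        if index == len(bases):
            if abs(mass_so_far - observed_mass) <= tolerance:
                results.append(dict(zip(bases, counts)))
            return
        unit = BASE_MASS[bases[index]]
        count = 0
        # Try 0, 1, 2, ... copies of the current base until the mass or
        # length bound is exceeded.
        while (mass_so_far + count * unit <= observed_mass + tolerance
               and sum(counts) + count <= max_length):
            extend(index + 1, counts + [count], mass_so_far + count * unit)
            count += 1

    extend(0, [], 0.0)
    return results


if __name__ == "__main__":
    peak = compomer_mass({"A": 2, "C": 1, "G": 0, "T": 3})
    for candidate in compomers_for_mass(peak, tolerance=0.5):
        print(candidate, round(compomer_mass(candidate), 2))
```

A single peak may match several compomers, which is why the abstract speaks of compositions that "potentially" create a peak; the ambiguity is resolved only later, when the spectra of all four cleavage reactions are combined.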

Similar resources

Weighted Sequencing from Compomers: DNA de-novo sequencing from mass spectrometry data in the presence of false negative peaks

One of the main endeavors in today’s Life Science remains the efficient sequencing of long DNA molecules. Today, most de-novo sequencing of DNA is still performed using electrophoresis-based Sanger Sequencing introduced in 1977, in spite of certain restrictions of this method. Recently, we proposed a new method for DNA sequencing using base-specific cleavage and mass spectrometry, that appears ...

Sequencing from compomers in the presence of false negative peaks

One of the main endeavors in today’s Life Science remains the efficient sequencing of long DNA molecules. Today, most de-novo sequencing of DNA is still performed using electrophoresis-based Sanger Sequencing, based on the Sanger concept of 1977. Methods using mass spectrometry to acquire the Sanger Sequencing data are limited by short sequencing lengths of 15–25 nt. Recently, we proposed a new...

I-37: Establishing High Resolution Genomic Profiles of Single Cells Using Microarray and Next-Generation Sequencing Technologies

The nature and pace of genome mutation is largely unknown. Standard methods to investigate DNA-mutation rely on arraying or sequencing DNA from a population of cells, hence the genetic composition of individual cells is lost and de novo mutation in cell(s) is concealed within the bulk signal. We developed methods based on (SNP-) arraying and next-generation sequencing of single-cell whole-genom...

Multi-spectra peptide sequencing and its applications to multistage mass spectrometry

Despite a recent surge of interest in database-independent peptide identifications, accurate de novo peptide sequencing remains an elusive goal. While the recently introduced spectral network approach resulted in accurate peptide sequencing in low-complexity samples, its success depends on the chance of presence of spectra from overlapping peptides. On the other hand, while multistage mass spec...

AuDeNS: A Tool for Automatic De Novo Peptide Sequencing

We have developed and implemented a framework for de novo sequencing of peptides using tandem mass spectrometry data. It first cleans the input spectrum with a number of data cleaning algorithms (“grass mowers”), followed by a sequencing algorithm that is a modification of a dynamic programming algorithm introduced in [CKT00]. In first experiments, our prototype performs well (but not better) i...

Journal title:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 11, Issue 6

Pages: -

Publication date: 2003